Theory of Change


The critical importance of mathematics has garnered increased attention in the past decade (National Mathematics Advisory Panel [NMAP], 2008; National Research Council [NRC], 2001). The most recent National Assessment of Educational Progress (NAEP) results classified 58% of fourth-grade students as failing to reach proficiency in mathematics and 17% as falling below basic achievement patterns on the NAEP; the results are even more disconcerting when examined by income, ethnicity, and disability status (National Center for Education Statistics, 2013). Young students without a deep understanding of mathematics risk losing access to more advanced mathematics including algebra (NMAP, 2008) and long-term career opportunities available in the fields of science, technology, mathematics, and engineering (National Science Board, 2008). The National Council of Teachers of Mathematics (2013) noted that “an economically competitive society recognizes the importance of mathematics learning to adult numeracy and financial literacy, and it depends on citizens who are mathematically literate” (p. 1). With recognition of the negative impact of low mathematics achievement at both the individual and national level, urgent calls from the highest levels of the federal government have been made for an increased focus on improving the mathematics outcomes of our nation’s students (Obama, 2013).

Occurring simultaneously with lower than desired levels of mathematics achievement is a growing recognition that a successful start in mathematics is critical in ensuring long-term success. Morgan, Farkas, and Wu (2009) analyzed longitudinal data from the Early Childhood Longitudinal Study database and found that of the students who entered and exited kindergarten below the 10th percentile, 70% remained below the 10th percentile in fifth grade. In contrast, of the students who entered kindergarten below the 10th percentile but exited above the 10th percentile, only 30% were below the 10th percentile in fifth grade. In other words, those students who came into kindergarten at an elevated risk for math difficulties but grew substantively over the course of the year were markedly less likely to be at risk up to 5 years later. These trends found in longitudinal data sets of mathematics achievement mirror those found for the development of reading trajectories (Juel, 1988). Such findings in the area of reading development spurred a focus on prevention of reading difficulties through the use of screening systems to identify at-risk students (Good, Gruba, & Kaminski, 2002) and the development of curriculum materials targeting foundational reading skills (Wanzek & Vaughn, 2010). A similar system, based on the idea of preventing mathematics difficulties before they fully develop by identifying at-risk students and providing early intervention services targeting key foundational skills, has been advocated in mathematics (Fuchs, Fuchs, & Compton, 2013).

The focus on prevention of mathematics difficulties fits within the context of service delivery in schools based on a tiered model of instruction commonly referred to as response to intervention (RTI; National Association of State Directors of Special Education, 2006). Though originally conceptualized as a procedure to evaluate eligibility for special education services (Individuals with Disabilities Education Improvement Act, 2004), in practice RTI has been implemented as a more robust system of support to increase the achievement of all students (Fuchs, Fuchs, & Zumeta, 2008; Vaughn & Fuchs, 2003). The shift in conceptualization has placed a tighter focus on the instructional supports provided to students at different levels of need, including the instruction provided as part of the core classroom experience (i.e., Tier 1) and additional instructional support (i.e., Tiers 2 and 3) provided to students who do not respond to research-based core instruction. RTI systems have, in some respects, become standard in reading (Vaughn, Wanzek, Woodruff, & Linan-Thompson, 2007), whereas in mathematics key RTI components require further investigation (Bryant et al., 2011) to meet the need for research- and evidence-based programs (Glover & DiPerna, 2007) vital to any RTI system.

Current Research on Tier 2 Mathematics Intervention Programs

Any call for improved mathematics achievement within an RTI model is dependent on educators being able to access state-of-the-art curricular programs designed to address the specific needs of students attempting to gain access to mathematics content (Clarke, Baker, & Chard, 2008). The current research base in mathematics, while expanding, lags behind the field’s knowledge base regarding reading instruction (Gersten et al., 2009). We reviewed 78 elementary math programs that had been evaluated by the What Works Clearinghouse (WWC) and found that only 7 of them were evaluated by studies that met the WWC standards, with only 4 programs showing potential positive effects on student achievement. Two concerns are raised by this review. First, fewer than 10% of programs had been evaluated with research designs of sufficient rigor to enable conclusions to be drawn regarding their impact. Second, of the four programs showing a potentially positive impact, all four were core (Tier 1) programs and not designed specifically for use with at-risk students.

A second analysis found that only nine intervention studies had been conducted on programs suitable for use as Tier 2 programs in an RTI model (Newman-Gonchar, Clarke, & Gersten, 2009). Of those nine studies, only two were designed for use with first-grade students. In the first study (Fuchs et al., 2005), a randomized controlled trial design was used to test the efficacy of a 63-lesson program, Number Rockets, on mathematics achievement. In each lesson, students received 30 min of small-group instruction on 17 key number concepts and then 10 min of computer-based instruction focused on increasing procedural fluency on mathematics facts. The results indicated a significant impact on three major areas of mathematics understanding, (a) computation, (b) concepts and applications, and (c) story problems, with effect sizes ranging from 0.11 to 0.70. An impact was not found on student fact fluency performance. A subsequent large-scale replication study evaluating Number Rockets was conducted in four states (Rolfhus et al., 2009) and found similar but more moderate results, with an effect size of 0.34 on the Test of Early Mathematics Ability, Third Edition (Ginsburg & Baroody, 2003). In the second of the studies reviewed (Newman-Gonchar, Clarke, & Gersten, 2012), Bryant, Bryant, Gersten, Scammacca, & Chavez (2008) used a less rigorous regression discontinuity design to examine the impact of a small-group intervention program targeting key mathematical concepts, as well as number concepts and relationships, such as base 10 and place value. On average, 64 lessons were completed across 18 weeks. The study did not find a significant impact on either a proximal or distal measure of student achievement. It should be noted that in both studies, the focus was on evaluating impact, and the authors did not address the role of potential mediators of student outcomes.

With recognition of the need for expanding the research base on effective mathematics instruction, a number of seminal documents (Gersten et al., 2009; NRC, 2001) including the NMAP (2008) report have explicitly called for the development and rigorous evaluation of mathematics curricula. Thus research on core programs (programs designed for and used at the whole-classroom level) and intervention programs (Tiers 2 and 3) provided in the early elementary grades is critical.

Theory-of-Change Models in Curriculum Development and Evaluation

Foundational to the development of research-based curricula are frameworks that link curriculum development efforts to an underlying theory of change. Clements (2007) noted that “developers must draw from existing research so that what is already known can be applied to the anticipated curriculum” (p. 37) and, in turn, developers must “structure and revise the nature and content of curricular components in accordance with models of children’s thinking and learning in a domain” (p. 37).

[Figure 1. Fusion theory of change. Column labels: Intervention Components, Mediator, Proximal Outcome, Distal Outcome.]


When using a theory of change, developers need to justify their predictions by drawing from the relevant theoretical and empirical knowledge bases. For example, researchers interested in developing an early mathematics curriculum will review existing research involving interventions for students with or at risk for mathematics difficulties. A strong theory of change also has roots in relevant theories of learning (e.g., Bransford & Donovan, 2005). By systematically grounding an intervention in the learning sciences, researchers are able to provide theoretical alignment between children’s thinking and learning of mathematics and the instructional techniques embedded within an intervention (Clements, 2007). A strong theory of change also addresses the variables hypothesized to mediate and moderate the impact of an intervention. As Rothman (2013) observed, “Mediators and moderators are the building blocks of theory and, in turn, intervention design, specifying the connections between these two classes of constructs is at the heart of developing, testing, and refining theory” (p. 190). Mediating variables refer to the processes that comprise an intervention, whereas moderating variables are student and teacher factors that may potentially change the relationship between an intervention and student outcomes. Establishing mediators and moderators in a theory of change offers researchers the opportunity to unpack the “black box” of classroom instruction by ascertaining whether an intervention is more effective under certain conditions or with a particular subgroup of the student population (MacKinnon & Luecken, 2008).

Fusion’s Theory of Change 

Our efforts to develop and evaluate a research-based first-grade mathematics intervention curriculum, Fusion, were guided by an underlying theory-of-change model. As depicted in Figure 1, the theory of change for the Fusion intervention is composed of three key levels: (a) intervention components, (b) mediator variables, and (c) proximal and distal student outcomes. The Fusion intervention contains two key components: whole-number content and explicit and systematic instructional design principles. When carefully integrated, these intervention components are expected to facilitate instructional interactions between teachers and students around foundational whole-number concepts and skills. We hypothesize that the quality of these instructional interactions will mediate students’ proximal outcomes (labeled conceptual understanding and procedural fluency in our theory of change). It is hypothesized that these two proximal outcomes will have a direct impact on student mathematics achievement, which is labeled as a distal outcome in Figure 1. In the following section, we describe each level of Fusion’s theory of change and summarize the importance of each in the relevant literature.

Intervention Component: Whole-Number Content

Fusion’s first intervention component attends to the calls from expert panels (NMAP, 2008; NRC, 2001, 2009) for early mathematics curricula to have greater focus on the critical aspects of whole numbers, often referred to as number sense. Proficiency with number sense is essential for students’ overall academic success throughout public school and the opportunities they have for meaningful postsecondary experiences (Morgan et al., 2009; NRC, 2001). A growing body of evidence suggests that many children, particularly children from economically and educationally disadvantaged backgrounds, do not possess a firm number sense and thus struggle with making quantitative comparisons, manipulating numbers and their operations, and grasping the connection between mathematical concepts and numerical relationships (Gersten & Chard, 1999). Although the definition of number sense varies among educational researchers and mathematicians (Berch, 2005), there is general consensus that early number sense leads to the automatic use of foundational math skills, such as completing written calculations and solving applied problems (Gersten & Chard, 1999; NMAP, 2008; NRC, 2001). In first grade, foundational attributes of number sense identified in the Common Core State Standards (Common Core State Standards Initiative, 2010) include place-value concepts, number combinations, multidigit problems involving addition and subtraction, and word-problem solving.


Instructional Component: Explicit and 
Systematic Design Principles 

The second intervention component of Fusion is the incorporation of explicit and systematic instructional design principles to support students’ development of mathematical proficiency. A consistent finding of empirical research is that explicit mathematics instruction has significant value for students struggling with mathematics (Baker, Gersten, & Lee, 2002; Gersten et al., 2009). For example, in a meta-analysis of 41 studies involving students with math disabilities, Gersten et al. (2009) reported that explicit instruction had a substantively important positive effect (Hedges’s g = 1.22) on student mathematics achievement. Explicit instruction is a structured delivery approach that promotes learning mastery in the foundational concepts and skills of early mathematics. According to experts in the field, an early mathematics curriculum is considered explicit when it supports teachers in (a) introducing new and complex math content through unambiguous explanations and demonstrations, (b) facilitating frequent opportunities for students to practice with important mathematics content, and (c) providing timely academic feedback to confirm correct student responses and address potential misconceptions (Archer & Hughes, 2010; Doabler et al., 2013; Gersten et al., 2009).

As with explicit design principles, research has also shown the importance of systematically designing mathematics instruction for students with difficulties in mathematics. Systematic design principles attend to the way academic information, such as math concepts and skills, is selected, prioritized, and organized within and across a curriculum’s lessons (Coyne, Kame’enui, & Carnine, 2011). For instance, a systematically designed curriculum will judiciously interweave and appropriately match visual representations of mathematics, such as place-value blocks, with abstract symbols to illustrate solution methods for math problems. A growing body of research shows that this concrete-representational-abstract approach supports students in formulating “well-developed knowledge packages” (Ma, 1999, p. 113) of fundamental mathematics (Gersten et al., 2009; Witzel, Riccomini, & Schneider, 2008).

Mediating Variables 

The instructional interactions that take place between teachers and students around critical mathematics content are a defining characteristic of effective classroom instruction and, we hypothesize, mediate student outcomes. Classrooms in which these critical teacher-student interactions occur at a higher rate would have greater student outcomes regarding critical early mathematics concepts. The purpose of such teacher-student interactions is to facilitate meaningful opportunities for students to interact with and practice important mathematical concepts, skills, and procedures. The frequency of practice opportunities has important implications for promoting students’ success in early mathematics, and findings from recent research suggest that a critical format of student practice is mathematical verbalization (Doabler et al., 2013; Gersten et al., 2009). Verbalizations offer opportunities for students (both specific individuals and the group at large) to communicate their mathematical thinking and understanding. In the early grades, math verbalizations can be a critical mode of student responding because they allow all students the opportunity to learn and participate. For example, a teacher can facilitate an entire class of students in explaining their solution methods for solving a multidigit addition problem.

Proximal and Distal Outcomes 

Mathematics proficiency is composed of two knowledge forms: conceptual understanding and procedural fluency (NRC, 2001). Conceptual understanding refers to an understanding of the relationship between representations of math concepts and abstract symbols, whereas procedural fluency entails automaticity of math procedures (Wu, 1999). Educational research has consistently shown that at-risk learners have difficulty making a connection between these two knowledge forms (Gersten et al., 2009). Therefore, in our theory of change (Figure 1), conceptual understanding and procedural fluency represent two proximal outcomes targeted by the Fusion intervention. We hypothesize that Fusion will support students’ development of these two knowledge forms concurrently. That is, as Fusion helps students build understanding of math concepts, it will increase their fluency in solving math problems through strategically planned opportunities for guided and independent practice, as well as cumulative review. Furthermore, we hypothesize that the reciprocal relationship between conceptual understanding and procedural fluency will have a direct impact on students’ overall mathematics achievement.

Purpose 

The primary purpose of this randomized controlled trial pilot study is to test the impact of a first-grade intervention program, Fusion, on the achievement of students at risk in mathematics. There is an intensive need for rigorous efficacy trials of first-grade mathematics interventions as evidenced by the paucity of current research in the area (Gersten et al., 2009; WWC, 2013) and by calls for focused efforts on the development of intervention programs (Gersten et al., 2009; NMAP, 2008). We hypothesize that students in the Fusion condition will have greater achievement outcomes than students in the control condition. In addition, given that previous studies have focused exclusively on student outcomes, a secondary purpose is to begin exploring the underlying mechanisms that guide the design of intervention programs and potentially mediate the impact of intervention programs. A direct examination of mediation specifically requires showing that a given condition accounts for differences in implementation across conditions.

In this study, we were unable to conduct mediation analysis because we did not have implementation data from control classrooms. To attempt to navigate this barrier, we examined associations between implementation quality and student achievement gains within treatment classrooms. Because the Fusion program is scripted to ensure high degrees of critical teacher and student behaviors, we hypothesize that higher levels of implementation will result in a greater degree of teacher-student interactions and greater student outcomes. That is, teachers who teach with an overall level of high quality and implement the program with high levels of fidelity will be engaging with their students in the types of behavior hypothesized to mediate student achievement in our theory of change. For example, Fusion contains language that prompts teachers to ask individual questions of participating students; thus, if a teacher is implementing Fusion with a high level of fidelity, we expect to see more frequent teacher-student interactions as students respond to individual questions. The work in this study stands to contribute to the limited body of research on first-grade mathematics interventions, and by examining results within the context of a theory-of-change model, findings from the study may contribute to the growing knowledge base on effective mathematics instruction within an RTI model of system delivery.

Method 

Participants 

The study took place in nine schools with approximately 10 eligible students per school, based on screening scores and teacher recommendations. Within each school, the research team randomly assigned these 10 students to the intervention condition (Fusion instruction) or the control condition (standard district practice) by using a random number generator and assigning the lowest five to intervention. The final sample included 89 students: 44 in the intervention group and 45 in the control group. Control students did not receive Fusion instruction but were not prohibited from receiving standard district intervention services. All participants received standard classroom mathematics instruction.
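
The assignment procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the student identifiers, and the use of Python's `random` module are hypothetical and are not taken from the study, which reports only that a random number generator was used and that the five lowest draws went to intervention.

```python
import random


def assign_conditions(student_ids, n_intervention=5, seed=None):
    """Assign students within one school to intervention or control.

    Each student receives a random draw; the students with the lowest
    draws (five by default) are assigned to intervention and the rest
    to control. The seed parameter is an assumption added here so the
    illustration is reproducible.
    """
    rng = random.Random(seed)
    draws = {sid: rng.random() for sid in student_ids}
    ranked = sorted(student_ids, key=lambda sid: draws[sid])
    intervention = set(ranked[:n_intervention])
    return {
        sid: "intervention" if sid in intervention else "control"
        for sid in student_ids
    }


# Hypothetical roster of 10 eligible students in one school.
roster = [f"student_{i:02d}" for i in range(1, 11)]
assignment = assign_conditions(roster, seed=2014)
```

Ranking random draws rather than flipping a coin per student guarantees a fixed split within each school, which is consistent with the near-equal group sizes reported for the final sample.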

Schools. The schools were drawn from two suburban school districts in the Pacific Northwest. District A (five schools) had 10,796 students: 33% were minorities, 6% were English-language learners, 60% were eligible for free/reduced-price lunch, and 15% received special education services. District B (four schools) had 5,866 students: 28% were minorities, 3% were English-language learners, 55% were eligible for free/reduced-price lunch, and 17% received special education services. The schools were from research partner districts in an Institute of Education Sciences development grant. District staff recruited schools within their district interested in participating.

Students. All first-grade students completed group-administered versions of the Quantity Discrimination (QD) and Missing Number (MN) measures. The group-administered QD and MN measures were modified versions of individually administered QD and MN measures (Clarke & Shinn, 2004). Raw scores on the screener were converted to z scores and averaged. In each school, the 10 lowest scoring students on the screener who did not meet the exclusion criteria were identified as eligible for the study. We excluded students if they could not identify or write numbers 1 to 10 or if they had severely limited English proficiency (based on the judgment of the student’s primary teacher). Demographic information for the sample is shown in Table 1.
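
A minimal sketch of the screening step may help make the procedure concrete. Several details below are assumptions, not facts from the study: the record keys, the function names, standardizing against all screened students rather than within each school, and the use of the sample standard deviation.

```python
from collections import defaultdict
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores against the group mean and sample SD."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def select_eligible(records, per_school=10):
    """Return, for each school, the lowest-scoring non-excluded students.

    Each record is a dict with hypothetical keys: 'student', 'school',
    'qd' and 'mn' (raw screener scores), and 'excluded' (True if the
    student met an exclusion criterion).
    """
    # Assumption: z scores are computed over every screened student.
    qd_z = z_scores([r["qd"] for r in records])
    mn_z = z_scores([r["mn"] for r in records])

    by_school = defaultdict(list)
    for record, zq, zm in zip(records, qd_z, mn_z):
        if not record["excluded"]:
            composite = (zq + zm) / 2  # average of the two z scores
            by_school[record["school"]].append((composite, record["student"]))

    # Sort each school's students from lowest to highest composite.
    return {
        school: [student for _, student in sorted(rows)[:per_school]]
        for school, rows in by_school.items()
    }
```

Averaging z scores instead of raw scores keeps the QD and MN measures on a common scale, so neither measure dominates the composite merely because it has a wider raw-score range.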

Interventionists. Nine district employees (i.e., interventionists) taught one small Fusion group each. Interventionists were included in the study based on time and schedule availability. All of the interventionists were women. One was a high school graduate, two had bachelor’s degrees, and six had master’s degrees. On average, the interventionists had 8.7 years’ teaching experience (range, 3-25 years), 7.4 years’ experience teaching math (range, 3-25 years), and 7.7 years’ experience teaching first grade (range, 4-20 years).

Measures 

Table 1

Descriptive Information for Demographic Characteristics by Condition

Demographic Characteristic                        Treatment (n = 44)   Control (n = 45)   χ² (1 df)   p Value
Male [n (%)]                                      21 (47.7)            29 (64.4)          2.53        .112
Nonwhite [n (%)]                                  4 (9.1)              9 (20.0)           2.12        .145
Hispanic [n (%)]                                  6 (13.6)             12 (26.7)          2.34        .126
Free/reduced-price lunch [n (%)]                  31 (70.5)            31 (68.9)          0.03        .872
English-language learner [n (%)]                  6 (13.6)             10 (22.2)          1.11        .292
Eligible for special education services [n (%)]   13 (29.5)            11 (24.4)          0.29        .588

Fidelity of implementation. Each Fusion lesson consisted of at least three primary activities. Observers rated implementation fidelity for the first three primary activities in a Fusion lesson using a 0-1 scale (0, not taught; 0.5, partial implementation; and 1, full implementation). A fidelity score for each observation was calculated by averaging ratings across Activities 1 through 3. Each interventionist’s fidelity scores were averaged across the three observation occasions. Observers also provided a holistic rating of overall level of implementation on a 7-point scale, with a score of 1 representing low implementation and 7 representing high implementation.
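
The activity-based fidelity score is a mean of means, which the following sketch illustrates. The function names and the example ratings are hypothetical; only the 0, 0.5, and 1 rating values and the two averaging steps come from the description above.

```python
from statistics import mean

VALID_RATINGS = {0, 0.5, 1}  # not taught, partial, full implementation


def observation_fidelity(activity_ratings):
    """Average the ratings for Activities 1 through 3 of one observation."""
    if len(activity_ratings) != 3:
        raise ValueError("expected ratings for exactly three activities")
    if any(r not in VALID_RATINGS for r in activity_ratings):
        raise ValueError("ratings must be 0, 0.5, or 1")
    return mean(activity_ratings)


def interventionist_fidelity(observations):
    """Average the per-observation scores across observation occasions."""
    return mean(observation_fidelity(obs) for obs in observations)


# Hypothetical ratings for one interventionist across three observations.
example = [[1, 1, 0.5], [1, 0.5, 0.5], [1, 1, 1]]
score = interventionist_fidelity(example)  # value between 0 and 1
```

Because every observation contributes the same number of activities, averaging in two steps gives the same result as averaging all nine ratings at once; the two-step form simply mirrors how the scores are described.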

Ratings of Classroom Management and Instructional Support. Ratings of Classroom Management and Instructional Support (RCMIS; Doabler & Nelson-Walker, 2009) is a holistic rating system composed of 14 items (e.g., clear and consistent delivery of instruction) that measure the quality of instructional interactions that take place between teachers and students around critical mathematics content (Cronbach’s α = 0.92). Each curriculum-independent item is rated on a 4-point scale from low (1) to high (4). For each observation, a score was calculated by averaging the ratings across the 14 items. For each group, an overall quality score was calculated as the mean across all observations. The RCMIS was used as a measure of overall instructional quality.

Early Numeracy Curriculum-Based Measures. Early Numeracy Curriculum-Based Measures (EN-CBM; Clarke & Shinn, 2004) was used as a proximal measure of students’ procedural fluency. All measures were timed for 1 min. The Oral Counting measure requires students to orally rote count as high as possible without making an error. Concurrent and predictive validities range from 0.46 to 0.72. For all EN-CBM measures, the criterion measures were the Number Knowledge Test (Okamoto & Case, 1996), Woodcock-Johnson Applied Problems subtest (Woodcock & Johnson, 1989), and Mathematics-CBM (Shinn, 1989). The predictive-validity timeframe was from the fall to the spring. The Number Identification measure requires students to orally identify numbers between 0 and 10 when presented with a set of printed number symbols. Concurrent and predictive validities range from 0.62 to 0.65. The QD measure requires students to name which of two visually presented numbers between 0 and 10 is greater. Concurrent and predictive validities range from 0.64 to 0.72. The MN measure requires students to name the missing number from a string of numbers (0-10). Concurrent and predictive validities range from 0.46 to 0.63. A total EN-CBM score, calculated by summing raw scores from the four subtests, was used in the analysis. Preliminary evidence indicates the measures’ capability to monitor growth (Clarke & Shinn, 2004; Clarke et al., 2008).

Group curriculum-based measure. Two of the individually administered EN-CBM (Clarke & Shinn, 2004) measures, QD and MN, were adapted for small-group administration and used as a screening instrument. Whereas the original measures require students to verbally respond to each item, the group curriculum-based measure has them write their responses (circling the correct choice or filling in the missing number). The test-retest reliabilities of the group-administered QD and MN measures are 0.87 and 0.85, respectively. Concurrent and predictive validities with the ProFusion assessment range from 0.58 to 0.80 for the QD measure and from 0.42 to 0.57 for the MN measure (Doabler et al., in press).

Stanford Achievement Test, 10th Edition. The Stanford Achievement Test, 10th Edition (SAT-10; Harcourt Educational Measurement, 2002), is a group-administered, norm-referenced examination for kindergarten through twelfth-grade students. Two math subtests were used as distal measures of mathematics performance. The Math Problem Solving subtest assesses problem solving and mathematical reasoning. The Math Procedures subtest assesses computational fluency. The SAT-10 is a standardized achievement test with reliability estimates that exceed 0.90 and a criterion-related validity coefficient of approximately 0.60 to 0.70 (Harcourt Educational Measurement, 2002).

ProFusion. The ProFusion measure was developed by the research team to assess students’ conceptual and procedural knowledge of number and numeration, place-value concepts, basic number combinations, and problems involving multidigit addition and subtraction. In an untimed, group setting, students are asked to write numbers from dictation (four items) and numbers missing from a sequence (three items), write numbers matching base-10 block models (three items), and decompose double-digit numbers (three items). Moreover, students complete addition problems and subtraction problems (eight items) and story problems (two items). Students also complete 1-min, timed addition (32 items possible) and subtraction (24 items possible) fluency measures and work with proctors individually to complete the number-identification portion (8 items). The criterion validity with other posttest measures used in the study was r = 0.56 with the EN-CBM total score and r = 0.68 with the SAT-10.

The measurement net for the study was designed to represent the theory of change outlined in the introduction section. The ProFusion measure functioned as a measure of proximal conceptual understanding, and EN-CBM measures were selected to function as a proximal measure of procedural fluency. The SAT-10 measure was used as a distal measure of mathematics achievement. The RCMIS and fidelity-of-implementation measures were used to examine overall instructional quality and teacher-student interactions.

Procedures 

Data collection. Prior to beginning data collection, data collectors with experience
in conducting educational assessments for research projects attended 2 days of training.
Data collectors for the EN-CBM measures and the SAT-10 were not affiliated with the
project in any other manner (e.g., interventionists, authors of this article). Fusion
interventionists administered the ProFusion assessment after a half day of training.
During training on individually administered assessments, data collectors were shadow
scored on a practice administration and held to a 90% interscorer reliability standard.
A fidelity checklist (e.g., reads directions as standardized) was used for all measures
to ensure reliable administration. Similar procedures were followed during data
collection in the field, with shadow scoring to a criterion of 90% interscorer
reliability on individually administered measures and the use of fidelity checklists on
all measures. Once data were collected, all protocols were double scored and double
entered by two data collectors. All first-grade students completed the group QD and MN
screeners approximately 1 month before the start of the intervention. Participating
students completed the EN-CBM, SAT-10, and ProFusion at pretest before the start of
Fusion instruction in their schools. After Fusion instruction ended, participants
completed the EN-CBM, SAT-10, and ProFusion at posttest. Pretest data were collected in
the 2 weeks before the start of the intervention, and posttest data were collected
during a 2-week window after the intervention was completed.

Trained project staff observed each group’s Fusion instruction three times (i.e., once
each during the beginning, middle, and end of the curriculum). Observers completed the
RCMIS during the same observations as the fidelity-of-implementation instrument. RCMIS
interobserver reliability assessment was conducted on 20% of all observation occasions,
and RCMIS interobserver reliability was 91%. The reliability of the RCMIS was calculated
by summing the total points by each observer and then dividing the smaller sum by the
larger sum. Fidelity-of-implementation interobserver reliability assessment was
conducted on 20% of all observations. Exact agreement (e.g., 100% reliability was found
when both observers scored an activity with the same rating and 0% reliability indicated
different ratings) was used to calculate reliability. Interobserver reliability was 95%
and 86% for the activity-based rating and holistic rating, respectively.
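
The two reliability indices described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the ratings are hypothetical, and the choice to treat two all-zero totals as full agreement is an assumption that the text does not address.

```python
def total_score_reliability(points_a, points_b):
    """Smaller-sum / larger-sum agreement between two observers' total points."""
    sum_a, sum_b = sum(points_a), sum(points_b)
    larger = max(sum_a, sum_b)
    if larger == 0:
        return 1.0  # assumption: two zero totals count as full agreement
    return min(sum_a, sum_b) / larger


def exact_agreement(ratings_a, ratings_b):
    """Share of activities that both observers scored with the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("observers must rate the same non-empty set of activities")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)


# Hypothetical ratings for one shared observation (not data from the study).
rcmis_a = [3, 4, 2, 3, 4]    # observer A item scores, total 16
rcmis_b = [3, 3, 2, 3, 4]    # observer B item scores, total 15
fidelity_a = [1, 1, 0.5, 1]  # observer A activity ratings (0, 0.5, or 1)
fidelity_b = [1, 1, 1, 1]    # observer B activity ratings

print(total_score_reliability(rcmis_a, rcmis_b))  # 15 / 16 = 0.9375
print(exact_agreement(fidelity_a, fidelity_b))    # 3 of 4 match = 0.75
```

Note that the total-score index can be high even when observers disagree item by item, because disagreements in opposite directions cancel in the sums; exact agreement is the stricter of the two indices.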

Fusion intervention. The Fusion curriculum is a Tier 2 Grade 1 mathematics intervention
designed for students at risk in whole-number concepts and skills. Students are taught
in small groups of approximately five students and receive 60 lessons, each lasting 30
min, delivered over a period of 20 weeks. In this study, on average, the intervention
lasted 18.6 weeks and teachers delivered 50.7 lessons.
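
As a rough check on dosage, the planned and delivered figures above can be compared directly. The sketch below uses only the numbers reported in this paragraph; the variable names are illustrative, and the derived quantities (pace, share delivered, instructional minutes) are simple arithmetic rather than results reported by the study.

```python
PLANNED_LESSONS = 60      # lessons in the Fusion curriculum
PLANNED_WEEKS = 20        # intended delivery period
LESSON_MINUTES = 30       # length of each lesson
DELIVERED_LESSONS = 50.7  # average lessons delivered in this study
DELIVERED_WEEKS = 18.6    # average duration of the intervention in this study

planned_pace = PLANNED_LESSONS / PLANNED_WEEKS          # lessons per week as designed
observed_pace = DELIVERED_LESSONS / DELIVERED_WEEKS     # lessons per week as delivered
share_delivered = DELIVERED_LESSONS / PLANNED_LESSONS   # fraction of the curriculum taught
delivered_hours = DELIVERED_LESSONS * LESSON_MINUTES / 60

print(f"planned pace:    {planned_pace:.2f} lessons/week")
print(f"observed pace:   {observed_pace:.2f} lessons/week")
print(f"share delivered: {share_delivered:.1%}")
print(f"delivered time:  {delivered_hours:.1f} hours")
```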

Each lesson includes the explicit introduction of new content and systematic practice
and review in four to five brief, scripted mathematics activities. Lessons use a variety
of math models and contain teacher modeling, scaffolded instructional examples, and
opportunities for teachers to provide academic feedback based on student responses to
individual and group questions. Two mathematical domains in the first-grade Common Core
State Standards (Operations and Algebraic Thinking, and Number and Operations in Base
Ten) form the basis of Fusion content. The first half of the curriculum emphasizes
number sense, basic number combinations, and place-value concepts. During the second
half of the curriculum, students encounter multidigit computation without regrouping and
word-problem solving. In this study, interventionists were given guidelines to deliver
one lesson per day, three times per week, in small-group instructional formats, with
approximately five students per group.

Professional development. Interventionists participated in two 3-hr professional
development workshops led by the authoring and research team. Workshops were intended to
deepen content knowledge for teaching mathematics, pedagogical knowledge, and comfort
teaching Fusion lessons. Workshops provided time to practice teaching Fusion lessons and
receive feedback from the interventionists’ peers and the curriculum’s authors. The
first workshop was conducted approximately 1 month before Fusion instruction. Content
included an overview of the study design and the interventionists’ role, an overview of
the Fusion intervention and its underlying principles and content, lesson
demonstrations, group management tips, and practice opportunities. The second training
workshop occurred after interventionists had implemented approximately one quarter of
the Fusion lessons. During this training, interventionists had the opportunity to ask
questions about the first half of the curriculum and were introduced to concepts in the
second half of the curriculum. There was no set standard that interventionists were
required to meet during training prior to implementation.

Statistical Analysis 

A series of random-effects models were estimated using the SPSS MIXED procedure to
compare gains in ProFusion, EN-CBM, and SAT-10 outcomes between the treatment and
control conditions. Raw scores were used in the analysis. The random-effects models
nested pretest and posttest assessments within students and students within
instructional groups. The models included the effects of time (coded 0 for pretest and 1
for posttest), condition of instructional group (coded 0 for control and 1 for
treatment), and the Condition-by-Time interaction. The Condition-by-Time interaction
represents the difference in gains in outcomes between the two groups. Hedges’s g was
used as a metric of intervention effect size for each outcome (0.2, 0.5, and 0.8 are
considered small, medium, and large effects, respectively; WWC, 2011). As recommended by
Feingold (2009), Hedges’s g was computed as the Condition-by-Time interaction effect
divided by the posttest pooled standard deviation of the outcome. In accordance with an
intent-to-treat approach, maximum likelihood estimation was used to obtain model
parameters and standard errors using all cases available, which results in less bias in
parameter estimates and standard errors than other methods of handling missing data
(e.g., listwise deletion; Schafer & Graham, 2002).
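
The effect-size computation described above can be illustrated with the estimates, posttest standard deviations, and posttest sample sizes reported in Table 2. The sketch below is not the authors' code: it assumes the pooled standard deviation is the usual sample-size-weighted pooling of the two groups' posttest variances, and it applies no small-sample correction, since the text describes none. Under those assumptions, the results round to the Hedges’s g values reported in Table 2.

```python
import math


def pooled_sd(sd_1, n_1, sd_2, n_2):
    """Pooled standard deviation, weighting each group's variance by n - 1."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return math.sqrt(pooled_var)


def effect_size(interaction_estimate, sd_1, n_1, sd_2, n_2):
    """Condition-by-Time estimate divided by the posttest pooled SD."""
    return interaction_estimate / pooled_sd(sd_1, n_1, sd_2, n_2)


# (estimate, Fusion posttest SD, Fusion n, control posttest SD, control n) from Table 2
TABLE_2 = {
    "ProFusion": (12.9, 15.1, 38, 16.3, 39),
    "EN-CBM": (7.8, 61.8, 40, 52.1, 41),
    "SAT-10": (1.1, 10.4, 38, 9.9, 40),
}

effect_sizes = {name: effect_size(*row) for name, row in TABLE_2.items()}
for name, g in effect_sizes.items():
    print(f"{name}: g = {g:.2f}")
# ProFusion: g = 0.82
# EN-CBM: g = 0.14
# SAT-10: g = 0.11
```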

A second set of analyses was conducted to examine the relationships between (a) gains in
student outcomes from pretest to posttest and (b) fidelity of Fusion implementation
across Activities 1 through 3 and quality-of-instruction (RCMIS) ratings averaged across
the three observation occasions. These analyses involved a series of random-effects
models nesting students assigned to the treatment condition within Fusion instructional
groups. Associations were estimated by regressing gain scores for each outcome on each
observation measure separately. We report standardized parameter estimates (Snijders &
Bosker, 1999).

Results 

Baseline Equivalency and Attrition 

The expectation of baseline equivalency owing to random assignment of groups was
examined. The treatment and control groups were compared regarding demographic
characteristics and outcome measures collected at pretest. Contingency-table analyses
and t tests were conducted on categorical and continuous measures, respectively. The
groups did not significantly differ on any demographic characteristics or pretest
outcome measures (Table 1). The extent to which attrition threatened the internal and
external validity of the study was evaluated using contingency-table analyses and
analysis of variance. Participants who completed all posttest assessments were compared
with those who did not with respect to demographic characteristics and pretest outcome
measures. We also conducted two-way analyses of variance to test whether outcome
variables were differentially affected across conditions by attrition. These latter
analyses examined the effects of condition and attrition status, as well as their
interaction, on pretest outcomes. Among the 45 students assigned to the Fusion condition
and the 44 control students, the attrition rates were 13.3% (n = 6) and 13.6% (n = 6),
respectively. The attrition rates did not significantly differ by condition. We found no
statistically significant differences in demographic characteristics or baseline
outcomes by attrition status, nor did we find any statistically significant interactions
between attrition and condition predicting baseline outcomes, suggesting that attrition
was not systematic.
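
The attrition comparison above can be reproduced approximately from the reported counts. The text specifies only a contingency-table analysis, so the sketch below assumes a Pearson chi-square test on the 2 x 2 table without a continuity correction; the function names are illustrative. With nearly identical attrition rates the statistic is close to zero, consistent with the nonsignificant difference reported.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Counts from the text: 6 of 45 Fusion students and 6 of 44 control students were lost.
fusion_lost, fusion_kept = 6, 45 - 6
control_lost, control_kept = 6, 44 - 6

fusion_rate = fusion_lost / (fusion_lost + fusion_kept)
control_rate = control_lost / (control_lost + control_kept)
chi_square = chi_square_2x2(fusion_lost, fusion_kept, control_lost, control_kept)
p_value = p_value_df1(chi_square)

print(f"attrition: Fusion {fusion_rate:.1%}, control {control_rate:.1%}")
print(f"chi-square(1) = {chi_square:.4f}, p = {p_value:.2f}")
```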

Intervention Effects for Fusion 

Table 2 provides descriptive statistics and intervention effects for each outcome
measure. The treatment group had statistically significantly greater gains on our
proximal measure of conceptual understanding, ProFusion, compared with control
participants (estimate = 12.9, p = .015, Hedges’s g = 0.82), corresponding to a large
effect. The difference between groups was not statistically significant with respect to
gains on our proximal measure of procedural fluency, EN-CBM (estimate = 7.8, p = .667,
Hedges’s g = 0.14), or on scores on our distal measure of conceptual understanding,
SAT-10 (estimate = 1.1, p = .590, Hedges’s g = 0.11).
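
To see what the Condition-by-Time estimate represents, it can help to compute the raw difference in gains from the observed means in Table 2. The sketch below is an illustration only: the raw difference in differences will not exactly match the model-based estimates, which come from nested random-effects models fitted by maximum likelihood to all available cases, but it should land close to them.

```python
# (pretest mean, posttest mean) by condition, copied from Table 2
MEANS = {
    "ProFusion": {"fusion": (23.0, 53.1), "control": (27.2, 44.0)},
    "EN-CBM": {"fusion": (143.0, 183.2), "control": (148.8, 182.2)},
    "SAT-10": {"fusion": (22.6, 33.4), "control": (23.2, 32.5)},
}

# Model-based Condition-by-Time estimates reported in Table 2
MODEL_ESTIMATES = {"ProFusion": 12.9, "EN-CBM": 7.8, "SAT-10": 1.1}


def difference_in_gains(measure):
    """Fusion gain minus control gain, using observed pretest and posttest means."""
    fusion_pre, fusion_post = MEANS[measure]["fusion"]
    control_pre, control_post = MEANS[measure]["control"]
    return (fusion_post - fusion_pre) - (control_post - control_pre)


for measure, estimate in MODEL_ESTIMATES.items():
    raw = difference_in_gains(measure)
    print(f"{measure}: raw difference in gains = {raw:.1f}, model estimate = {estimate}")
# ProFusion: raw difference in gains = 13.3, model estimate = 12.9
# EN-CBM: raw difference in gains = 6.8, model estimate = 7.8
# SAT-10: raw difference in gains = 1.5, model estimate = 1.1
```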

Table 2

Pretest and Posttest Descriptive Statistics and Condition-by-Time
Intervention Effects for Outcome Measures

                          Pretest                 Posttest           Condition-by-Time Intervention Effect
Measure               M       SD      n       M       SD      n      Estimate      t        p      Hedges’s g
ProFusion                                                              12.9       2.71     .015       0.82
   Fusion            23.0    11.2    44      53.1    15.1    38
   Control           27.2    13.4    42      44.0    16.3    39
EN-CBM                                                                  7.8       0.44     .667       0.14
   Fusion           143.0    42.1    44     183.2    61.8    40
   Control          148.8    45.9    44     182.2    52.1    41
SAT-10                                                                  1.1       0.55     .590       0.11
   Fusion            22.6     6.5    44      33.4    10.4    38
   Control           23.2     7.3    43      32.5     9.9    40

Note. Tests of the Condition-by-Time interaction used 16 df. EN-CBM = Early Numeracy
Curriculum-Based Measure; SAT-10 = Stanford Achievement Test, 10th Edition.

Fidelity of Implementation, Quality of
Instruction, and Student Performance

Table 3 provides descriptive statistics for the fidelity of implementation across
Activities 1 through 3 and quality-of-instruction ratings averaged across the three
observation occasions. To serve as a proxy for our hypothesized mediator,
teacher-student interactions, we also examined associations between these measures and
gains in student outcomes; Table 3 summarizes these associations as standardized
parameter estimates (βs). Although none of the relationships were statistically
significant (p > .15 for all tests), moderate to large positive associations were found
between gains in (a) ProFusion outcomes and implementation of Activities 1 through 3
(β = 0.21), overall implementation fidelity (β = -0.20), teachers providing models
(β = -0.23), and teachers providing individual turns (β = -0.23) and (b) SAT-10 outcomes
and modeling (β = 0.26). Moderate negative associations were found between gains in (a)
EN-CBM outcomes and implementation of Activities 1 through 3, modeling, providing
individual turns, and overall fidelity of implementation (β = -0.39, β = -0.23,
β = -0.23, and β = -0.20, respectively) and (b) SAT-10 outcomes and providing group
response opportunities and quality of instruction (β = -0.21 and β = -0.26,
respectively).

Table 3

Descriptive Statistics for Fidelity of Implementation and Quality of
Instruction Ratings and Their Associations With Student Outcomes

                                                                    Associations With Student Outcome (β)
Measure                                     M (SD)       Range      ProFusion     EN-CBM      SAT-10
Fidelity of implementationᵃ
   Implemented Activities 1 through 3      0.9 (0.1)    0.8-1.0       0.21        -0.39        0.04
   Modeled skill or concept                0.9 (0.2)    0.6-1.0       0.50        -0.23        0.26
   Provided group response opportunities   0.9 (0.2)    0.5-1.0      -0.05         0.13       -0.21
   Provided individual turns               0.9 (0.1)    0.7-1.0       0.32        -0.23        0.16
   Provided academic feedback              0.9 (0.1)    0.6-1.0      -0.08         0.04       -0.19
   Overallᵇ                                5.2 (1.1)    3.3-6.3      -0.03        -0.20       -0.16
Quality of instructionᶜ                    3.2 (0.6)    2.4-3.7      -0.04        -0.09       -0.26

Note. Items and summary scores were averaged across three observation occasions. Tests
of associations (fixed effects) used 7 df, with p > .15 for all tests. EN-CBM = Early
Numeracy Curriculum-Based Measure; SAT-10 = Stanford Achievement Test, 10th Edition.
ᵃ Items were rated as 0 (no), 0.5 (partially), or 1 (yes).
ᵇ Overall fidelity of implementation was rated from 1 (low) to 7 (high).
ᶜ Quality-of-instruction items were rated from 1 (not present) to 4 (highly present).

Discussion 

We examined the impact of a Tier 2 first-grade intervention program targeting critical
whole-number content. We hypothesized that at-risk students in the intervention
condition would show greater gains than their at-risk peers in the control condition.
The results from this study provide partial support for our primary hypothesis. On a
proximal measure assessing conceptual understanding of whole-number content, ProFusion,
students in the intervention condition showed statistically significantly greater gains,
with a large effect (WWC, 2011). Results on a proximal measure of procedural fluency,
EN-CBM, and a distal measure of conceptual understanding, SAT-10, were not statistically
significant, but both showed small positive effect sizes. The WWC (2011) provides a
classification system to generate an overall descriptor of results when a study has
multiple measures. If a study has one statistically significant positive result and the
other results in the study are nonsignificant but show positive effect sizes, the
overall results for the study are described as having a statistically significant
positive impact on student outcomes. Although there are a limited number of randomized
controlled trials focused on early mathematics (Dyson, Jordan, & Glutting, 2013), the
pattern of results found in this study is similar to results found in other studies of
comprehensive intervention programs with first-grade students in which overall positive
results were found with greater impacts on proximal measures of achievement (Bryant,
Bryant, Gersten, Scammacca, Funk, et al., 2008; Bryant et al., 2011; Fuchs et al., 2005;
Rolfhus et al., 2012). For example, Fuchs et al. (2005) found effect sizes up to 0.7 on
proximal measures of achievement, but in a replication study of the same program,
Rolfhus et al. (2012) found an effect size of 0.34 on a distal measure of achievement.
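
The classification rule paraphrased above can be written as a small function and applied to the three outcomes from Table 2. The sketch below encodes only the single case described in this paragraph, reads "one statistically significant positive result" as "at least one," and assumes a .05 significance threshold; none of this should be read as the WWC's full classification scheme, and every other pattern of results is deliberately left unclassified.

```python
ALPHA = 0.05  # assumed significance threshold; not stated in the paragraph above


def overall_descriptor(results, alpha=ALPHA):
    """Classify a study from (p value, effect size) pairs, one pair per outcome.

    Returns a descriptor only for the case described in the text; returns None
    for every other pattern, which this sketch does not attempt to classify.
    """
    significant_positive = [(p, g) for p, g in results if p < alpha and g > 0]
    others = [(p, g) for p, g in results if not (p < alpha and g > 0)]
    others_nonsignificant_positive = all(p >= alpha and g > 0 for p, g in others)
    if significant_positive and others_nonsignificant_positive:
        return "statistically significant positive"
    return None


# (p value, Hedges's g) for ProFusion, EN-CBM, and SAT-10, from Table 2
STUDY_RESULTS = [(0.015, 0.82), (0.667, 0.14), (0.590, 0.11)]

print(overall_descriptor(STUDY_RESULTS))  # statistically significant positive
```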

Regarding our second research hypothesis that higher levels of implementation fidelity
would be associated with greater student gains, an analysis of the association between
implementation variables and student outcomes did not show significant results. This
analysis was designed to serve as a proxy for the potential role of teacher-student
interactions functioning as a mediator of student outcomes. The pattern of
nonsignificant mixed results that were found across associations makes it difficult to
draw conclusions concerning our hypothesis that greater levels of implementation quality
would be positively associated with student achievement gains.

Limitations 

A number of considerations are important when interpreting the results of this study.
First, because the study is an initial pilot study, the sample is restricted by
geographic location and the demographic characteristics of the study sample are not
representative of the national population of first-grade students. In addition, because
of the small sample size, the power to detect treatment impact is limited. Sufficient
power may not have been present to detect small positive trends in the data on the
EN-CBM and SAT-10 measures. Despite these limitations, there is some preliminary
support, on the basis of the overall findings on student outcome measures, that Fusion
had a positive impact on student achievement.

A critical element in supporting our theory of change is whether teacher and student
behaviors mediate achievement. However, because implementation data were not collected
in control classrooms, we were not able to directly examine mediation. To navigate this
barrier, we examined implementation fidelity based on the hypothesis that teachers who
implemented Fusion with greater fidelity would have greater rates of critical behaviors
(e.g., providing models of math concepts) than teachers who implemented with lower
levels of fidelity. Thus, although we present fidelity-of-implementation data as an
attempt to explore the role of mediation, it is a limitation of the study that a formal
mediation analysis was not conducted.

Our exploratory analysis found no significant results supporting our hypothesis that
higher levels of implementation quality would mediate student outcomes. In part, this
may have been because of overall high levels of implementation across groups, and thus
range restriction may have attenuated the magnitude of those relationships. The mean
score on each of the five implementation variables examined was at least 0.9 (on a scale
of 0-1, with 1 representing full implementation), and the largest standard deviation was
0.2. In other words, because all teachers implemented Fusion to a relatively high level,
there was a lack of variability in measuring implementation quality. The lack of
variability may have contributed to the counterintuitive pattern of negative
associations between implementation fidelity and the EN-CBM and SAT-10. That is, greater
implementation was associated with lower outcomes on those measures. In part, this may
be because the small sample size of the study limited the stability of estimates.
Another possibility is that there may have been poor alignment between the EN-CBM and
SAT-10 measures and the content of the intervention. The EN-CBM measures were designed
to be a proximal measure of procedural fluency, but the fluency focus of the Fusion
intervention was aimed at fluency with basic facts and not the skills directly assessed
by the EN-CBM measures (e.g., number identification and magnitude comparison). The same
concern holds for the SAT-10: although it was designed as a distal measure of
achievement, it included content such as geometry and measurement concepts that were not
a direct focus of the Fusion intervention. We do caution that because the results were
nonsignificant, any interpretation of the results and potential causes is speculative.
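
The range-restriction argument above can be illustrated with a small simulation. Everything in the sketch below is hypothetical: the data are simulated rather than drawn from the study, and the true slope, sample size, and the choice to keep only the top 20% of fidelity scores are arbitrary assumptions. The sketch shows only that when a predictor varies little, its observed correlation with an outcome tends to shrink even though the underlying relationship is unchanged.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)  # fixed seed so the illustration is repeatable
N = 5000

# Simulated standardized fidelity scores and gains with a true positive slope of 0.5.
fidelity = [random.gauss(0, 1) for _ in range(N)]
gain = [0.5 * f + random.gauss(0, 1) for f in fidelity]
full_r = pearson_r(fidelity, gain)

# Restrict the sample to high implementers only (top 20% of fidelity scores).
cutoff = sorted(fidelity)[int(0.8 * N)]
restricted = [(f, g) for f, g in zip(fidelity, gain) if f >= cutoff]
restricted_r = pearson_r(*zip(*restricted))

print(f"full range:       r = {full_r:.2f} (n = {N})")
print(f"restricted range: r = {restricted_r:.2f} (n = {len(restricted)})")
```

With far fewer groups than in this simulation, as in a pilot study, sampling error adds to the attenuation, so weak or even negative estimates can arise by chance.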

Implications for Practice and Future 
Research 

Examining results from this study and other research studies on early mathematics
intervention within a framework provided by a theory-of-change model offers insights to
guide both research and practice. Given that the results presented were generated from a
pilot study, caution should be exercised when interpreting results and discussing
implications for practice. The results from this study and, in particular, the overall
positive impact results on student outcomes offer a starting point for schools to
consider the use of Fusion as one potential tool within an RtI service delivery model.
Additional evidence is needed to warrant a definitive statement on whether schools
should implement Fusion. One important consideration when evaluating the student outcome
results from this study is how the results fit within a pattern of curriculum design and
research findings in early mathematics. Across an array of early mathematics
intervention programs (e.g., Bryant et al., 2011; Dyson, Jordan, & Glutting, 2013; Fuchs
et al., 2005; Sood & Jitendra, 2013), there is a general trend by researchers who
develop curricula toward building curricula based on the two intervention components
that provide the theoretical foundation for Fusion: a focus on whole-number content and
an explicit and systematic instruction approach. Given that these approaches mirror
recommendations from individual experts (Gersten et al., 2009; Milgram & Wu, 2005) and
national panels (NMAP, 2008), it is reasonable to suggest that schools should actively
look for intervention programs with a similar focus given that current programs may not
meet these recommendations. Schools may need to provide support to educators in the
classroom as they implement current intervention programs lacking these features. For
example, schools can support the collection of observation data focused on key
instructional behaviors and link observation findings to teacher coaching and
professional development in critical areas such as providing students opportunities to
verbalize their mathematical thinking (Clarke et al., 2008). This is a role well suited
for school psychologists or other school professionals with expertise in conducting
classroom observations and instructional design. In addition, as new intervention
programs are developed and made available, educators looking to adopt programs should
ensure that their review process puts particular emphasis on examining whether programs
under consideration provide focused content coverage and an explicit and systematic
instructional design approach.

Although there is general consensus that curricula should contain the two aforementioned
components, future research should take a more fine-grained approach to tease out and
manipulate specific program variables. One line of research in this vein examines the
concept of instructional intensity (Warren, Fey, & Yoder, 2007) by focusing on specific
instructional design manipulations that vary the intensity of the instructional
experience for the student. For example, Bryant et al. (2011), extending a line of
research working with at-risk first graders (Bryant, Bryant, Gersten, Scammacca, &
Chavez, 2008; Bryant, Bryant, Gersten, Scammacca, Funk, et al., 2008), have, over
multiple iterations of their program, focused on increasing instructional intensity by
expanding the amount of instructional time required as part of the intervention.
Although increasing instructional intensity can be accomplished by varying delivery
parameters such as group size and number of lessons, instructional intensity can also be
increased by manipulating elements embedded within intervention programs such as the
number of models provided by the teacher or individual response opportunities for
students.

To address the issue of examining instructional intensity as a potential mediating
variable, one possible remedy would be to use a more robust observational system across
treatment and control classrooms. For example, in a recent efficacy trial of a
kindergarten mathematics program, our research team conducted approximately 400
observations in 129 kindergarten classrooms using a low-inference observation instrument
called the Classroom Observations of Student-Teacher Interactions-Mathematics (COSTI-M;
Doabler et al., in press). We were particularly interested in using the COSTI-M to
examine the relationship between the rate of explicit instructional interactions and
student mathematics achievement (Doabler et al., in press). Specifically, the COSTI-M
allowed us to capture three key components of instructional interactions hypothesized to
potentially mediate student mathematics achievement: explicit teacher demonstrations,
student practice opportunities, and timely academic feedback. A key finding from the
study is that students in classrooms with higher rates of practice opportunities made
substantively important gains in critical mathematics outcomes. Further research on
Fusion and other programs using a similar observation system framework would allow a
more robust examination of potential mediation variables and shed light on the theories
of change underlying different programs.

Fuchs et al. (2013) have conducted a line of longitudinal research focused on developing
and investigating mathematics interventions across the early elementary grades. This
line of research offers a number of valuable insights. Although the intervention
programs studied were effective as measured by traditional analytic approaches, they
were not universally effective for all students in two critical ways. First, although
some students responded to the program, the impact on achievement was not great enough
to fully reduce the achievement gap between at-risk students and their not-at-risk
peers. This finding mirrors similar results from other studies of curriculum programs
(e.g., Clarke et al., 2011) and the general difficulty in fully reducing achievement
gaps (Starkey & Klein, 2008). Second, and even more critically, despite the provision of
research-based instruction, there remained a subgroup of students who did not respond to
the intervention. One potential way to increase the efficacy of instructional
interventions for students who do not respond is to modify programs based on a
theory-of-change model. For example, to increase the efficacy of a preexisting
intervention program, Fuchs et al. (2010) modified a portion of the intervention program
focused on increasing procedural fluency with mathematics facts, an area of particular
difficulty and critical importance for students with mathematics learning disabilities
(Geary, 2004). One set of modifications was linked to a theory of change based on the
potential moderating role of what the authors termed domain general abilities.
Specifically, Fuchs et al. (2013) structured the math fact practice time to “compensate
for . . . potential weaknesses in the domain general abilities associated with
difficulty with math facts: inattentive behavior, processing speed, phonological
processing, working memory, and reasoning ability” (p. 260). In part, these types of
systematic manipulations are an inherent and valuable part of design science (Brown,
1992; Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003) and the construction of
research-validated curricula (Clements, 2007).

These efforts illustrate that multiple perspectives and theories of change are informing
the development and research of mathematics curricula. This is not to suggest that
research teams investigating a specific theory of change and corresponding mediator and
moderator variables fail to acknowledge the role of other potential mediators and
moderators but, rather, that specific research studies and lines of research may focus
on delving deeply and systematically into a specific component of a broader
theory-of-change model. Thus, as a whole, researchers should be prepared to assimilate
findings from these various research lines into more robust theories of change that
enhance the overall quality of curriculum development efforts. A number of next steps
are vital to extend the research on Fusion. Foremost is the need to collect
implementation data and teacher-student interaction data across conditions to allow true
mediation analysis. Second, greater attention should be paid to ensuring that the
student outcome measures more closely align with the underlying theory of change. This
could include a proximal measure of procedural fluency focused on basic number
combination fluency and a distal measure of conceptual understanding with an emphasis on
whole-number concepts and understanding. Lastly, given that this pilot study had a small
sample size, subsequent studies should examine Fusion with larger sample sizes and test
Fusion across an array of different geographic locations and sample demographic
characteristics to increase the generalizability of results.

Conclusion 

Given the critical importance of a successful start in mathematics (Hanich, Jordan,
Kaplan, & Dick, 2001) and the need for effective intervention programs for use with
tiered models of instruction (Gersten et al., 2009), the importance of researchers
developing, investigating, and modifying theory-of-change models is paramount. Through
individual and collective efforts to do so, the research field has the opportunity to
contribute greatly to the quality of mathematics instruction provided in our nation’s
schools as we attempt to provide all students with a strong foundation in mathematical
understanding.